voice command
These New Smart Glasses From Ex-OnePlus Engineers Have a Hidden Cost
The Kickstarter-funded glasses from L'Atitude 52 N have AI features bundled for 1 year, but the company doesn't know yet how much it will charge for access after that. Lots of smart glasses have AI bots inside them now. The one in L'Atitude 52 N's glasses is called Goya, named after Francisco Goya, the famous Spanish artist who painted renowned masterpieces of romanticism. CEO and founder Gary Chen, who has worked on wearable devices for companies like Oppo, OnePlus, and HTC, says his company's glasses are focused on travelers, with AI features that act like a tour guide and talk about all the paintings in famous museums. "Basically, you can say, 'Hey, Goya, what is the story about Mona Lisa?'" Chen says. "You can ask anything and, with your permission, they will take a photo to analyze what's in front of you."
- North America > United States > California > San Francisco County > San Francisco (0.04)
- Europe > Slovakia (0.04)
- Europe > Czechia (0.04)
- Asia > Middle East > Iran (0.04)
- Information Technology > Human Computer Interaction > Interfaces (1.00)
- Information Technology > Hardware (1.00)
- Information Technology > Communications (0.95)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.48)
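The company hasn't said how Goya is wired up, but the interaction described above--wake phrase, spoken question, optional photo capture, multimodal answer--maps onto a familiar assistant loop. Below is a minimal sketch under those assumptions; transcribe, capture_photo, and ask_vision_model are hypothetical stand-ins, not L'Atitude 52 N APIs.

```python
# Hypothetical sketch of a wake-word -> question -> photo -> answer loop.
# None of these helpers are real L'Atitude 52 N / Goya APIs; they stand in
# for an on-device speech recognizer, the glasses' camera, and a
# vision-language model.

WAKE_PHRASE = "hey goya"

def transcribe(audio_chunk: bytes) -> str:
    """Placeholder: run speech recognition on a short audio clip."""
    raise NotImplementedError

def capture_photo() -> bytes:
    """Placeholder: grab a frame from the glasses' forward camera."""
    raise NotImplementedError

def ask_vision_model(question: str, image: bytes | None) -> str:
    """Placeholder: send the question (and optional image) to a multimodal model."""
    raise NotImplementedError

def handle_utterance(audio_chunk: bytes, camera_permission: bool) -> str | None:
    text = transcribe(audio_chunk).lower()
    if not text.startswith(WAKE_PHRASE):
        return None  # ignore speech that isn't addressed to the assistant
    question = text[len(WAKE_PHRASE):].strip(" ,")
    # Only capture an image if the wearer has granted camera permission,
    # mirroring the "with your permission" behavior described in the article.
    image = capture_photo() if camera_permission else None
    return ask_vision_model(question, image)
```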
Meet the AI-powered robotic dog ready to help with emergency response
Developed by Texas A&M University engineering students, this AI-powered robotic dog doesn't just follow commands. Designed to navigate chaos with precision, the robot could help revolutionize search-and-rescue missions, disaster response and many other emergency operations. Sandun Vitharana, an engineering technology master's student, and Sanjaya Mallikarachchi, an interdisciplinary engineering doctoral student, spearheaded the invention of the robotic dog. It can process voice commands and uses AI and camera input to perform path planning and identify objects. A roboticist would describe it as a terrestrial robot that uses a memory-driven navigation system powered by a multimodal large language model (MLLM).
- North America > United States > Texas (0.28)
- North America > United States > Ohio (0.05)
- Europe > Switzerland > Zürich > Zürich (0.05)
- Asia > Kazakhstan (0.05)
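The article doesn't detail the Texas A&M implementation, but a "memory-driven navigation system" steered by voice commands generally means remembering where objects were seen and resolving spoken requests against that memory. A self-contained sketch of that pattern follows; the MLLM matching step is replaced by a naive substring lookup purely to keep the example runnable.

```python
# Hypothetical sketch of memory-driven navigation from a voice command.
# The actual system's internals aren't public; this only illustrates the
# general pattern: remember where objects were detected, then resolve a
# spoken request to a stored location.

from dataclasses import dataclass

@dataclass
class Observation:
    label: str                      # e.g. "fire extinguisher", from camera input
    position: tuple[float, float]   # (x, y) in the robot's map frame

class SceneMemory:
    """Accumulates detections so later commands can refer back to them."""
    def __init__(self) -> None:
        self.observations: list[Observation] = []

    def add(self, label: str, position: tuple[float, float]) -> None:
        self.observations.append(Observation(label, position))

    def find(self, query: str) -> Observation | None:
        # A real system would ask the multimodal LLM to match free-form
        # language to memory; a substring match keeps this sketch self-contained.
        for obs in reversed(self.observations):
            if obs.label in query.lower():
                return obs
        return None

def plan_from_command(command: str, memory: SceneMemory) -> tuple[float, float] | None:
    """Return a navigation goal for a spoken command, or None if unresolved."""
    target = memory.find(command)
    return target.position if target else None

memory = SceneMemory()
memory.add("fire extinguisher", (3.2, -1.5))
print(plan_from_command("go to the fire extinguisher", memory))  # (3.2, -1.5)
```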
These appliances don't depend on smart speakers for voice control
Emerson Smart's new appliances respond to voice commands, but they don't need a smart speaker--or even a broadband connection--to pull off the trick. Smart appliances with voice control are nothing new, but IAI Smart is showing a new line of Emerson Smart appliances at CES that don't need a smart speaker in the middle and don't rely on a broadband connection, an app, or any other infrastructure--everything is processed locally. If you're leery of the privacy and security vulnerabilities of IoT devices, this could be the answer.
- Information Technology > Security & Privacy (1.00)
- Information Technology > Internet of Things (0.99)
- Information Technology > Communications > Networks (0.97)
- Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.61)
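For a sense of what fully local voice control can look like, here is a minimal sketch of matching an on-device transcript against a fixed, appliance-specific command grammar with no cloud round trip. It illustrates the general approach only; it is not Emerson Smart's or IAI Smart's firmware, and the phrases are made up.

```python
# Minimal sketch of fully local voice control over a fixed command grammar.
# The vendor hasn't published how its firmware works; this just shows the
# idea of mapping an on-device speech-to-text transcript to an action
# without any network connection.

COMMANDS = {
    "turn on the oven": ("oven", "power_on"),
    "turn off the oven": ("oven", "power_off"),
    "set oven to 350 degrees": ("oven", "set_temp_350"),
    "start the dishwasher": ("dishwasher", "start_cycle"),
}

def match_command(transcript: str) -> tuple[str, str] | None:
    """Map a locally produced transcript to an (appliance, action) pair."""
    text = transcript.lower().strip()
    return COMMANDS.get(text)

print(match_command("Turn on the oven"))      # ('oven', 'power_on')
print(match_command("order more detergent"))  # None: outside the local grammar
```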
Multimodal "Puppeteer": Exploring Robot Teleoperation Via Virtual Counterpart with LLM-Driven Voice and Gesture Interaction in Augmented Reality
Zhang, Yuchong, Orthmann, Bastian, Ji, Shichen, Welle, Michael, Van Haastregt, Jonne, Kragic, Danica
The integration of robotics and augmented reality (AR) offers promising opportunities to enhance human-robot interaction (HRI) by making teleoperation more transparent, spatially grounded, and intuitive. We present a head-mounted AR "puppeteer" framework in which users control a physical robot via interacting with its virtual counterpart robot using large language model (LLM)-driven voice commands and hand-gesture interaction on the Meta Quest 3. In a within-subject user study with 42 participants performing an AR-based robotic pick-and-place pattern-matching task, we compare two interaction conditions: gesture-only (GO) and combined voice+gesture (VG). Our results show that GO currently provides more reliable and efficient control for this time-critical task, while VG introduces additional flexibility but also latency and recognition issues that can increase workload. We further explore how prior robotics experience shapes participants' perceptions of each modality. Based on these findings, we distill a set of evidence-based design guidelines for AR puppeteer metaphoric robot teleoperation, implicating multimodality as an adaptive strategy that must balance efficiency, robustness, and user expertise rather than assuming that additional modalities are universally beneficial. Our work contributes empirical insights into how multimodal (voice+gesture) interaction influences task efficiency, usability, and user experience in AR-based HRI.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology (0.67)
- Health & Medicine > Health Care Technology (0.46)
- Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
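The paper's Quest 3 pipeline isn't reproduced here, but the voice half of the voice+gesture (VG) condition--transcribe an utterance, have an LLM extract a structured pick-and-place intent, act on the virtual counterpart robot--can be sketched as follows. llm_extract_intent and execute_on_virtual_robot are hypothetical placeholders, not the authors' stack.

```python
# Hypothetical sketch of the voice half of a voice+gesture condition:
# a spoken instruction is transcribed, an LLM (stubbed here) extracts a
# structured pick-and-place intent, and the virtual "puppet" robot executes
# it, mirroring the motion to the physical robot.

import json

def llm_extract_intent(utterance: str) -> str:
    """Placeholder for an LLM call returning JSON such as
    {"action": "pick_and_place", "object": "red block", "target": "slot 2"}."""
    raise NotImplementedError

def execute_on_virtual_robot(intent: dict) -> None:
    """Placeholder: drive the AR counterpart robot, which mirrors to hardware."""
    print(f"moving {intent['object']} to {intent['target']}")

def handle_voice_command(utterance: str) -> None:
    intent = json.loads(llm_extract_intent(utterance))
    if intent.get("action") == "pick_and_place":
        execute_on_virtual_robot(intent)
    # Unrecognized intents fall through; the study notes that recognition
    # errors and latency are exactly where the VG condition loses time.
```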
I Ditched Alexa and Upgraded My Smart Home
Here's how I cut down my family's reliance on Alexa. Until recently, my smart home setup was in chaos. After years of testing, buying, and upgrading to the latest smart home gadgets in an attempt to make my life easier, it became a bloated mess that was actually making it more complicated. My Alexa, Google Home, and Apple Home apps were awash with dead devices, duplicates, and automations that simply didn't work. My Hue Bridge, trying desperately to tie it all together, was creaking at the seams.
- Asia > Nepal (0.14)
- North America > United States > California (0.04)
- Europe > Slovakia (0.04)
- Europe > Czechia (0.04)
Multimodal Deep Learning for ATCO Command Lifecycle Modeling and Workload Prediction
Air traffic controllers (ATCOs) issue high-intensity voice commands in dense airspace, where accurate workload modeling is critical for safety and efficiency. This paper proposes a multimodal deep learning framework that integrates structured data, trajectory sequences, and image features to estimate two key parameters in the ATCO command lifecycle: the time offset between a command and the resulting aircraft maneuver, and the command duration. A high-quality dataset was constructed, with maneuver points detected using sliding window and histogram-based methods. A CNN-Transformer ensemble model was developed for accurate, generalizable, and interpretable predictions. By linking trajectories to voice commands, this work offers the first model of its kind to support intelligent command generation and provides practical value for workload assessment, staffing, and scheduling.
A. Background
As global air traffic demand increases, airspace operations have become more complex and congested, presenting major challenges for air traffic control (ATC) systems. Although surveillance and communication technologies have improved, ATC performance still largely depends on human operators, particularly air traffic controllers (ATCOs), who monitor flights, assess conditions, and issue maneuver instructions to ensure safe and efficient operations. This human bottleneck has become a key constraint on ATC efficiency and safety, emphasizing the importance of quantifying task intensity and evaluating workload to support fatigue management, staff scheduling, and the development of intelligent ATC solutions. Early studies on ATCO workload modeling primarily focused on statistical methods and subjective assessments such as the NASA Task Load Index (NASA-TLX) [1].
- South America > Brazil (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Nevada > Clark County > Las Vegas (0.04)
- (4 more...)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Air (1.00)
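A rough PyTorch sketch of the kind of multimodal regressor the abstract describes: structured features, a trajectory sequence, and image features fused to predict the two lifecycle targets (command-to-maneuver offset and command duration). Layer sizes, the fusion scheme, and the mean-pooling are assumptions for illustration, not the authors' CNN-Transformer ensemble.

```python
# Sketch of a multimodal regressor in the spirit of the paper's description.
# Requires torch; all dimensions are placeholder assumptions.

import torch
import torch.nn as nn

class CommandLifecycleModel(nn.Module):
    def __init__(self, n_struct: int = 16, traj_dim: int = 4, img_dim: int = 128):
        super().__init__()
        self.struct_mlp = nn.Sequential(nn.Linear(n_struct, 64), nn.ReLU())
        # Trajectory branch: embed each point (e.g. x, y, altitude, speed),
        # then let a Transformer encoder model the sequence.
        self.traj_embed = nn.Linear(traj_dim, 64)
        enc_layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
        self.traj_encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        # Image branch: assume features were already extracted by a CNN backbone.
        self.img_mlp = nn.Sequential(nn.Linear(img_dim, 64), nn.ReLU())
        # Two regression targets: time offset and command duration.
        self.head = nn.Sequential(nn.Linear(64 * 3, 64), nn.ReLU(), nn.Linear(64, 2))

    def forward(self, struct_x, traj_x, img_x):
        s = self.struct_mlp(struct_x)                    # (B, 64)
        t = self.traj_encoder(self.traj_embed(traj_x))   # (B, T, 64)
        t = t.mean(dim=1)                                # pool over time steps
        i = self.img_mlp(img_x)                          # (B, 64)
        return self.head(torch.cat([s, t, i], dim=-1))   # (B, 2)

model = CommandLifecycleModel()
out = model(torch.randn(8, 16), torch.randn(8, 20, 4), torch.randn(8, 128))
print(out.shape)  # torch.Size([8, 2])
```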
These eye-popping smart lights boast built-in AI microphones
Smart lights that react to voice commands spoken to smart speakers are old hat, but a smart light with a built-in AI microphone? Showing off its wares at the IFA trade show in Berlin this week, Germany-based smart device manufacturer Lepro is teeing up a quartet of "AI Lighting Pro" lights that can set the mood based on your natural-language prompts--anything from "Give me an Iron Man vibe" to "Set a cyberpunk city theme." Each of the lights features a built-in microphone that captures your commands (you must say the "Hey Lepro" wake phrase first) and processes them using Lepro's new LightGPM AI engine, a large language model that's trained on "color psychology and lighting design," Lepro says. The AI then delivers an "ideal" multi-color lighting scene based on your voice prompt. We've seen plenty of smart lights with AI-powered light scene bots before; Philips Hue is integrating one into the Hue app, and Govee and Nanoleaf have their own versions.
- Europe > Germany (0.26)
- North America (0.06)
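Lepro's LightGPM engine is proprietary, but the flow the company describes--wake phrase, natural-language prompt, model-chosen palette pushed to the fixture--can be sketched with a stubbed model call. llm_scene_from_prompt and the four-zone layout below are assumptions, not Lepro's implementation.

```python
# Sketch of prompt -> lighting scene resolution. The LLM call is a stub that
# is expected to return a list of RGB colors for a scene, which is then
# spread across the light's zones.

def llm_scene_from_prompt(prompt: str) -> list[tuple[int, int, int]]:
    """Placeholder: ask a language model for a palette matching the prompt,
    e.g. "Give me an Iron Man vibe" -> [(178, 24, 43), (255, 191, 0), ...]."""
    raise NotImplementedError

def apply_scene(colors: list[tuple[int, int, int]], n_zones: int = 4) -> list[tuple[int, int, int]]:
    """Spread the palette across the fixture's zones, repeating if needed."""
    return [colors[i % len(colors)] for i in range(n_zones)]
```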
Cog-TiPRO: Iterative Prompt Refinement with LLMs to Detect Cognitive Decline via Longitudinal Voice Assistant Commands
Qi, Kristin, Zhu, Youxiang, Summerour, Caroline, Batsis, John A., Liang, Xiaohui
Early detection of cognitive decline is crucial for enabling interventions that can slow neurodegenerative disease progression. Traditional diagnostic approaches rely on labor-intensive clinical assessments, which are impractical for frequent monitoring. Our pilot study investigates voice assistant systems (VAS) as non-invasive tools for detecting cognitive decline through longitudinal analysis of speech patterns in voice commands. Over an 18-month period, we collected voice commands from 35 older adults, with 15 participants providing daily at-home VAS interactions. To address the challenges of analyzing these short, unstructured, and noisy commands, we propose Cog-TiPRO, a framework that combines (1) LLM-driven iterative prompt refinement for linguistic feature extraction, (2) HuBERT-based acoustic feature extraction, and (3) transformer-based temporal modeling. Using iTransformer, our approach achieves 73.80% accuracy and 72.67% F1-score in detecting mild cognitive impairment (MCI), outperforming its baseline by 27.13%. Through our LLM approach, we identify linguistic features that uniquely characterize everyday command usage patterns in individuals experiencing cognitive decline.
- North America > United States > North Carolina (0.28)
- North America > United States > Massachusetts (0.28)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Health & Medicine > Therapeutic Area > Neurology > Dementia (0.49)
- Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.47)
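A sketch of the three-stage pipeline the abstract outlines: per-command linguistic features from an LLM, per-command HuBERT acoustic features, and a temporal model over the longitudinal sequence. The feature extractors are stubbed and the dimensions are assumptions, not the Cog-TiPRO implementation.

```python
# Sketch of the described pipeline: fuse per-command linguistic and acoustic
# features, then build a time-ordered sequence for a transformer-style
# temporal classifier (e.g. iTransformer, not reproduced here).

import numpy as np

def llm_linguistic_features(command_text: str) -> np.ndarray:
    """Placeholder: features from iteratively refined LLM prompts
    (e.g. vocabulary richness, repetitiveness, request complexity)."""
    raise NotImplementedError  # would return shape (d_ling,)

def hubert_acoustic_features(command_audio: bytes) -> np.ndarray:
    """Placeholder: pooled HuBERT embeddings for the command audio."""
    raise NotImplementedError  # would return shape (d_acou,)

def build_sequence(commands: list[tuple[str, bytes]]) -> np.ndarray:
    """Stack fused per-command features into a (T, d_ling + d_acou) array,
    ordered by time, ready for the temporal model."""
    rows = [
        np.concatenate([llm_linguistic_features(text), hubert_acoustic_features(audio)])
        for text, audio in commands
    ]
    return np.stack(rows)
```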
How to take photos on your phone via remote control
Our smartphones have transformed the way we take photos and videos and our relationship to these digital memories. Most of us will snap at least some pictures and clips every day with the gadget that's always close at hand. If you want to get more creative with photos on your phone, you can. Sometimes you're going to want to take a picture remotely, without your phone in your hand and your finger over the shutter button--maybe you're taking a wide shot of a large group, or you want to capture a lot of your surroundings.
- Information Technology > Communications > Mobile (0.78)
- Information Technology > Artificial Intelligence (0.70)
Reducing Latency in LLM-Based Natural Language Commands Processing for Robot Navigation
Pollini, Diego, Guterres, Bruna V., Guerra, Rodrigo S., Grando, Ricardo B.
The integration of Large Language Models (LLMs), such as GPT, in industrial robotics enhances operational efficiency and human-robot collaboration. However, the computational complexity and size of these models often introduce latency in request and response times. This study explores the integration of the ChatGPT natural language model with the Robot Operating System 2 (ROS 2) to mitigate interaction latency and improve robotic system control within a simulated Gazebo environment. We present an architecture that integrates these technologies without requiring a middleware transport platform, detailing how a simulated mobile robot responds to text and voice commands. Experimental results demonstrate that this integration improves execution speed, usability, and accessibility of the human-robot interaction by decreasing the communication latency by 7.01% on average. Such improvements facilitate smoother, real-time robot operations, which are crucial for industrial automation and precision tasks.
- South America > Uruguay (0.04)
- South America > Argentina (0.04)
- Asia > South Korea > Daegu > Daegu (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
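A minimal rclpy sketch in the spirit of the architecture described: a single node receives a natural-language command, queries a language model (stubbed here rather than tied to a specific ChatGPT client), and publishes velocities directly, with no extra middleware between the model and ROS 2. The topic names and the two-float reply format are assumptions for illustration, not the authors' code.

```python
# Minimal ROS 2 (rclpy) node: natural-language command in, Twist out.
# llm_to_velocity stands in for the LLM call; topic names are assumed.

import rclpy
from rclpy.node import Node
from std_msgs.msg import String
from geometry_msgs.msg import Twist

def llm_to_velocity(command: str) -> tuple[float, float]:
    """Placeholder for the LLM call; should return (linear_x, angular_z)."""
    raise NotImplementedError

class NLCommandNode(Node):
    def __init__(self) -> None:
        super().__init__('nl_command_node')
        self.cmd_pub = self.create_publisher(Twist, '/cmd_vel', 10)
        self.create_subscription(String, '/nl_command', self.on_command, 10)

    def on_command(self, msg: String) -> None:
        linear_x, angular_z = llm_to_velocity(msg.data)
        twist = Twist()
        twist.linear.x = linear_x
        twist.angular.z = angular_z
        self.cmd_pub.publish(twist)  # e.g. "move forward slowly" -> small +x

def main() -> None:
    rclpy.init()
    rclpy.spin(NLCommandNode())

if __name__ == '__main__':
    main()
```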